Docket No. GC 381



PATENT 



PROTEASES FROM GRAM-POSITIVE ORGANISMS 






FIELD OF THE INVENTION 



The present invention relates to cysteine proteases derived from gram-positive 
microorganisms. The present invention provides nucleic acid and amino acid 
sequences of cysteine protease 1, 2 and 3 identified in Bacillus. The present invention 
also provides methods for the production of cysteine protease 1, 2 and 3 in host cells
as well as the production of heterologous proteins in a host cell having a mutation or 
deletion of part or all of at least one of the cysteine proteases of the present invention. 



BACKGROUND OF THE INVENTION

Gram-positive microorganisms, such as members of the group Bacillus, have
been used for large-scale industrial fermentation due, in part, to their ability to secrete
their fermentation products into the culture media. In gram-positive bacteria, secreted
proteins are exported across a cell membrane and a cell wall, and then are
subsequently released into the external media usually maintaining their native
conformation.

Various gram-positive microorganisms are known to secrete extracellular
and/or intracellular protease at some stage in their life cycles. Many proteases are
produced in large quantities for industrial purposes. A negative aspect of the presence
of proteases in gram-positive organisms is their contribution to the overall degradation
of secreted heterologous or foreign proteins.

The classification of proteases found in microorganisms is based on their
catalytic mechanism which results in four groups: the serine proteases;
metalloproteases; cysteine proteases; and aspartic proteases. These categories can be
distinguished by their sensitivity to various inhibitors. For example, the serine
proteases are inhibited by phenylmethylsulfonylfluoride (PMSF) and
diisopropylfluorophosphate (DIFP); the metalloproteases by chelating agents; the
cysteine enzymes by iodoacetamide and heavy metals; and the aspartic proteases by
pepstatin. The serine proteases have alkaline pH optima, the metalloproteases are
optimally active around neutrality, and the cysteine and aspartic enzymes have acidic
pH optima (Biotechnology Handbooks, Bacillus, vol. 2, edited by Harwood, 1989
Plenum Press, New York).

The activity of cysteine protease depends on a catalytic dyad of cysteine and
histidine with the order differing between families. The best known family of
cysteine proteases is that of papain having catalytic residues Cys-25 and His-159.
Cysteine proteases of the papain family catalyze the hydrolysis of peptide, amide,
ester, thiol ester and thiono ester bonds. Naturally occurring inhibitors of cysteine
proteases of the papain family are those of the cystatin family (Methods in
Enzymology, vol. 244, Academic Press, Inc. 1994).

SUMMARY OF THE INVENTION 

The present invention relates to the unexpected and surprising discovery of
three heretofore unknown or unrecognized cysteine proteases found in Bacillus
subtilis, designated herein as CP1, CP2 and CP3, having the nucleic acid and amino
acid sequences shown in Figures 1A-1B, Figures 5A-5B and Figures 6A-6B,
respectively. The present invention is based, in part, upon the presence of the
characteristic cysteine protease amino acid motif GXCWAF found in uncharacterised
translated genomic nucleic acid sequences of Bacillus subtilis. The present invention
is also based in part upon the structural relatedness that CP1 has with the cysteine
protease papain, specifically with respect to the location of the catalytic
histidine/alanine and asparagine/serine residues, and the structural relatedness that
CP1 has with CP2 and CP3.

The present invention provides isolated polynucleotide and amino acid
sequences for CP1, CP2 and CP3. Due to the degeneracy of the genetic code, the
present invention encompasses any nucleic acid sequence that encodes the CP1, CP2
and CP3 amino acid sequence shown in the Figures.

The present invention encompasses amino acid variations of B. subtilis CP1,
CP2 and CP3 amino acids disclosed herein that have proteolytic activity. B. subtilis
CP1, CP2 and CP3, as well as proteolytically active amino acid variations thereof,
have application in cleaning compositions. The present invention also encompasses
amino acid variations or derivatives of CP1, CP2 and CP3 that do not have the
characteristic proteolytic activity as long as the nucleic acid sequences encoding such
variations or derivatives would have sufficient 5' and 3' coding regions to be capable
of being integrated into a gram-positive organism genome. Such variants would have
applications in gram-positive expression systems where it is desirable to delete,
mutate, alter or otherwise incapacitate the naturally occurring cysteine protease in
order to diminish or delete its proteolytic activity. Such an expression system would
have the advantage of allowing for greater yields of recombinant heterologous
proteins or polypeptides.

The present invention provides methods for detecting gram-positive
microorganism homologs of B. subtilis CP1, CP2 and CP3 that comprise hybridizing
part or all of the nucleic acid encoding B. subtilis CP1, CP2 and CP3 with nucleic acid
derived from gram-positive organisms, either of genomic or cDNA origin. In one
embodiment, the gram-positive microorganism is selected from the group consisting
of B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B.
amyloliquefaciens, B. coagulans, B. circulans, B. lautus and Bacillus thuringiensis.

In yet another aspect, the present invention provides a gram-positive
microorganism having a mutation or deletion of part or all of the gene encoding CP1
and/or CP2 and/or CP3, which results in the inactivation of the CP1 and/or CP2
and/or CP3 proteolytic activity, either alone or in combination with mutations in other
proteases, such as apr, npr, epr, mpr for example, or other proteases known to those of
skill in the art. In one embodiment of the present invention, the gram-positive
organism is a member of the genus Bacillus. In another embodiment, the Bacillus is
Bacillus subtilis.

The production of desired heterologous proteins or polypeptides in
gram-positive microorganisms may be hindered by the presence of one or more proteases
which degrade the produced heterologous protein or polypeptide. One advantage of
the present invention is that it provides methods and expression systems which can be
used to prevent that degradation, thereby enhancing yields of the desired heterologous
protein or polypeptide.

In another aspect, the gram-positive host having one or more
cysteine protease deletions is further genetically engineered to produce a desired
protein.

In one embodiment of the present invention, the desired protein is
heterologous to the gram-positive host cell. In another embodiment, the desired
protein is homologous to the host cell. The present invention encompasses a
gram-positive host cell having a deletion or interruption of the nucleic acid encoding the
naturally occurring homologous protein, such as a protease, and having nucleic acid
encoding the homologous protein re-introduced in a recombinant form. In another
embodiment, the host cell produces the homologous protein. Accordingly, the present
invention also provides methods and expression systems for reducing degradation of
heterologous proteins produced in gram-positive microorganisms. The gram-positive
microorganism may be normally sporulating or non-sporulating.

In a further aspect of the present invention, gram-positive CP1, CP2 or CP3 is
produced on an industrial fermentation scale in a microbial host expression system. In
another aspect, isolated and purified recombinant CP1, CP2 or CP3 is used in
compositions of matter intended for cleaning purposes, such as detergents.
Accordingly, the present invention provides a cleaning composition comprising one or
more of a gram-positive cysteine protease selected from the group consisting of CP1,
CP2 and CP3. The cysteine protease may be used alone or in combination with other
enzymes and/or mediators or enhancers.
BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A-1B show the DNA and amino acid sequence for CP1 (YJDE).

Figure 2 shows an amino acid alignment of papain (accession number
papa carpa.p) with the cysteine protease CP1, designated YJDE. For Figures 2, 3 and
4, the motif GXCWAF has been marked along with the catalytic cysteine and the
conserved catalytic histidine/alanine and asparagine/serine residues.

Figure 3 shows the amino acid alignment of CP1 (YJDE) with CP3 (PMI).

Figure 4 shows the amino acid alignment of CP1 (YJDE) with CP2 (YdhS).

Figures 5A-5B show the amino acid and nucleic acid sequence for CP2 (YdhS).

Figures 6A-6B show the amino acid and nucleic acid sequence for CP3 (PMI).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
Definitions 

As used herein, the genus Bacillus includes all members known to those of
skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B.
brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B.
circulans, B. lautus and B. thuringiensis.

The present invention encompasses novel CP1, CP2 and CP3 from
gram-positive organisms. In a preferred embodiment, the gram-positive organism is a
Bacillus. In another preferred embodiment, the gram-positive organism is Bacillus
subtilis. As used herein, "B. subtilis CP1, CP2 or CP3" refers to the amino acid
sequences shown in the Figures. Figures 1A-1B show the amino acid and nucleic acid
sequence for CP1 (YJDE); Figures 5A-5B show the amino acid and nucleic acid
sequence for CP2 (YdhS); and Figures 6A-6B show the amino acid and nucleic acid
sequence for CP3 (PMI). The present invention encompasses amino acid variations
of the amino acid sequences disclosed in Figures 1A-1B, 5A-5B and 6A-6B that
have proteolytic activity. Such proteolytic amino acid variants can be used in cleaning
compositions. The present invention also encompasses B. subtilis amino acid
variations or derivatives that are not proteolytically active. DNA encoding such
variants can be used in methods designed to delete or mutate the naturally occurring
host cell CP1, CP2 or CP3.

As used herein, "nucleic acid" refers to a nucleotide or polynucleotide
sequence, and fragments or portions thereof, and to DNA or RNA of genomic or
synthetic origin which may be double-stranded or single-stranded, whether
representing the sense or antisense strand. As used herein "amino acid" refers to
peptide or protein sequences or portions thereof. A "polynucleotide homolog" as
used herein refers to a gram-positive microorganism polynucleotide that has at least
80%, at least 90% or at least 95% identity to B. subtilis CP1, CP2 or CP3, or which is
capable of hybridizing to B. subtilis CP1, CP2 or CP3 under conditions of high
stringency and which encodes an amino acid sequence having cysteine protease
activity.

The terms "isolated" or "purified" as used herein refer to a nucleic acid or 
amino acid that is removed from at least one component with which it is naturally
associated. 

As used herein, the term "heterologous protein" refers to a protein or
polypeptide that does not naturally occur in a gram-positive host cell. Examples of
heterologous proteins include enzymes such as hydrolases including proteases,
cellulases, amylases, carbohydrases, and lipases; isomerases such as racemases,
epimerases, tautomerases, or mutases; transferases, kinases and phosphatases. The
heterologous gene may encode therapeutically significant proteins or peptides, such as
growth factors, cytokines, ligands, receptors and inhibitors, as well as vaccines and
antibodies. The gene may encode commercially important industrial proteins or
peptides, such as proteases, carbohydrases such as amylases and glucoamylases,
cellulases, oxidases and lipases. The gene of interest may be a naturally occurring
gene, a mutated gene or a synthetic gene.

The term "homologous protein" refers to a protein or polypeptide native or
naturally occurring in a gram-positive host cell. The invention includes host cells
producing the homologous protein via recombinant DNA technology. The present
invention encompasses a gram-positive host cell having a deletion or interruption of
the nucleic acid encoding the naturally occurring homologous protein, such as a
protease, and having nucleic acid encoding the homologous protein re-introduced in a
recombinant form. In another embodiment, the host cell produces the homologous
protein.

As used herein, the term "overexpressing" when referring to the production of
a protein in a host cell means that the protein is produced in greater amounts than its 
production in its naturally occurring environment. 

As used herein, the phrase "proteolytic activity" refers to the ability of a protein
to hydrolyze a peptide bond. Enzymes having proteolytic activity are described in
Enzyme Nomenclature, 1992, edited by Webb, Academic Press, Inc.
Detailed Description of the Preferred Embodiments

The unexpected discovery of the cysteine proteases CP1, CP2 and CP3 in
B. subtilis provides a basis for producing host cells, expression methods and systems
which can be used to prevent the degradation of recombinantly produced heterologous
proteins. In a preferred embodiment, the host cell is a gram-positive host cell that has
a deletion or mutation in the naturally occurring cysteine protease, said mutation
resulting in deletion or inactivation of the production by the host cell of the proteolytic
cysteine protease gene product. The host cell may additionally be genetically
engineered to produce a desired protein or polypeptide.

It may also be desired to genetically engineer host cells of any type to produce
a gram-positive cysteine protease. Such host cells are used in large scale fermentation
to produce large quantities of the cysteine protease which may be isolated or purified
and used in cleaning products, such as detergents.

I. Cysteine Protease Sequences

The CP1, CP2 and CP3 polynucleotides having the sequences as shown in
Figures 1A-1B, 5A-5B and 6A-6B, respectively, encode the Bacillus subtilis cysteine
proteases CP1, CP2 and CP3. As will be understood by the skilled artisan, due to the
degeneracy of the genetic code, a variety of polynucleotides can encode the Bacillus
subtilis CP1, CP2 and CP3. The present invention encompasses all such
polynucleotides.
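
To illustrate the scale of the degeneracy referred to above, the following is a minimal sketch that counts how many distinct coding sequences can encode a given peptide, using the codon multiplicities of the standard genetic code. The function name and the example peptide (one instance of the GXCWAF motif with X taken as serine) are illustrative assumptions and are not taken from the Figures.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are excluded).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return the number of distinct nucleotide sequences encoding `peptide`."""
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical example peptide: the motif GXCWAF with X = S.
    print(count_encoding_sequences("GSCWAF"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why the claims are framed in terms of the encoded amino acid sequence rather than any single polynucleotide.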

The present invention encompasses CP1, CP2 and CP3 polynucleotide
homologs encoding gram-positive microorganism cysteine proteases CP1, CP2 and
CP3, respectively, which have at least 80%, or at least 90% or at least 95% identity to
B. subtilis CP1, CP2 and CP3 as long as the homolog encodes a protein that has
proteolytic activity.
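
The identity thresholds above can be sketched in code as follows. This is an illustrative calculation only: it assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identity over all aligned columns, which is one of several conventions in use. The function names are hypothetical.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pid: float) -> Optional[int]:
    """Return the highest threshold (95, 90 or 80) met by `pid`, else None."""
    for threshold in (95, 90, 80):
        if pid >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    pid = percent_identity("ATGGCA-TT", "ATGGCAGTT")
    print(round(pid, 1), identity_tier(pid))
```

A sequence meeting a threshold is only a candidate homolog under the text above; it must also encode a protein having proteolytic activity, which this sketch does not assess.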

Gram-positive polynucleotide homologs of B. subtilis CP1, CP2 or CP3 may be
obtained by standard procedures known in the art from, for example, cloned DNA (e.g., a
DNA "library"), genomic DNA libraries, by chemical synthesis once identified, by cDNA
cloning, or by the cloning of genomic DNA, or fragments thereof, purified from a desired
cell. (See, for example, Sambrook et al., 1989, Molecular Cloning, A Laboratory
Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York;
Glover, D.M. (ed.), 1985, DNA Cloning: A Practical Approach, IRL Press, Ltd.,
Oxford, U.K. Vol. I, II.) A preferred source is genomic DNA. Nucleic acid
sequences derived from genomic DNA may contain regulatory regions in addition to
coding regions. Whatever the source, the isolated CP1, CP2 or CP3 gene should be
molecularly cloned into a suitable vector for propagation of the gene.
In the molecular cloning of the gene from genomic DNA, DNA fragments are
generated, some of which will encode the desired gene. The DNA may be cleaved at
specific sites using various restriction enzymes. Alternatively, one may use DNAse in
the presence of manganese to fragment the DNA, or the DNA can be physically sheared,
as for example, by sonication. The linear DNA fragments can then be separated
according to size by standard techniques, including but not limited to, agarose and
polyacrylamide gel electrophoresis and column chromatography.

Once the DNA fragments are generated, identification of the specific DNA
fragment containing the CP1, CP2 or CP3 may be accomplished in a number of ways.
For example, a B. subtilis CP1, CP2 or CP3 gene of the present invention or its
specific RNA, or a fragment thereof, such as a probe or primer, may be isolated and
labeled and then used in hybridization assays to detect a gram-positive CP1, CP2 or
CP3 gene. (Benton, W. and Davis, R., 1977. Science 196:180; Grunstein, M. and
Hogness, D., 1975. Proc. Natl. Acad. Sci. USA 72:3961). Those DNA fragments
sharing substantial sequence similarity to the probe will hybridize under stringent
conditions.

Accordingly, the present invention provides a method for the detection of
gram-positive CP1, CP2 and CP3 polynucleotide homologs which comprises
hybridizing part or all of a nucleic acid sequence of B. subtilis CP1, CP2 and CP3
with gram-positive microorganism nucleic acid of either genomic or cDNA origin.

Also included within the scope of the present invention are gram-positive
microorganism polynucleotide sequences that are capable of hybridizing to the
nucleotide sequence of B. subtilis CP1, CP2 or CP3 under conditions of intermediate
to maximal stringency. Hybridization conditions are based on the melting temperature
(Tm) of the nucleic acid binding complex, as taught in Berger and Kimmel (1987,
Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol. 152,
Academic Press, San Diego CA) incorporated herein by reference, and confer a
defined "stringency" as explained below.

"Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm
of the probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate
stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 20°C to
25°C below Tm. As will be understood by those of skill in the art, a maximum
stringency hybridization can be used to identify or detect identical polynucleotide
sequences while an intermediate or low stringency hybridization can be used to
identify or detect polynucleotide sequence homologs.
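
The stringency ranges defined above can be expressed as a simple lookup on the difference between the probe Tm and the hybridization temperature. The sketch below is illustrative only; the text gives approximate ("about") ranges, so the treatment of the exact boundary values, of temperatures less than 5°C below Tm, and of temperatures outside the stated ranges is an assumption made here, as is the function name.

```python
def classify_stringency(tm_c: float, hybridization_temp_c: float) -> str:
    """Classify hybridization stringency from degrees Celsius below Tm."""
    offset = tm_c - hybridization_temp_c  # degrees below the probe Tm
    if offset < 0:
        return "above Tm"              # assumption: not covered by the text
    if offset <= 5:
        return "maximum"               # about Tm - 5 degrees C
    if offset <= 10:
        return "high"                  # about 5 to 10 degrees C below Tm
    if offset <= 20:
        return "intermediate"          # about 10 to 20 degrees C below Tm
    if offset <= 25:
        return "low"                   # about 20 to 25 degrees C below Tm
    return "below defined ranges"      # assumption: not covered by the text


if __name__ == "__main__":
    for temp in (65, 62, 55, 47):
        print(temp, classify_stringency(70.0, temp))
```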
The term "hybridization" as used herein shall include "the process by which a
strand of nucleic acid joins with a complementary strand through base pairing" 
(Coombs J (1994) Dictionary of Biotechnology, Stockton Press, New York NY). 

The process of amplification as carried out in polymerase chain reaction (PCR)
technologies is described in Dieffenbach CW and GS Dveksler (1995, PCR Primer, a
Laboratory Manual, Cold Spring Harbor Press, Plainview NY). A nucleic acid
sequence of at least about 10 nucleotides and as many as about 60 nucleotides from B.
subtilis CP1, CP2 or CP3, preferably about 12 to 30 nucleotides, and more preferably
about 20-25 nucleotides, can be used as a probe or PCR primer.
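
As a rough illustration of the length ranges given above, the following sketch enumerates every subsequence of a template whose length falls within a chosen window (defaulting to the more preferred 20-25 nucleotides). It applies no other primer design criteria such as Tm, GC content or secondary structure; the function name and the example template are hypothetical.

```python
from typing import List, Tuple


def candidate_probes(template: str, min_len: int = 20,
                     max_len: int = 25) -> List[Tuple[int, str]]:
    """Return (start, subsequence) for every window of min_len..max_len nt."""
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    template = template.upper()
    probes = []
    for length in range(min_len, max_len + 1):
        # Slide a window of the current length across the template.
        for start in range(len(template) - length + 1):
            probes.append((start, template[start:start + length]))
    return probes


if __name__ == "__main__":
    example = "ACGT" * 6 + "AC"  # hypothetical 26-nt template
    print(len(candidate_probes(example)))
```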

The B. subtilis amino acid sequences CP1, CP2 and CP3 (shown in Figures 2, 4
and 3, respectively) were identified via a FASTA search of Bacillus subtilis genomic
nucleic acid sequences. B. subtilis CP1 (YJDE) was identified by its structural
homology to the cysteine protease papain having the sequence designated
"papacarpa.p". As shown in Figure 2, YJDE has the motif GXCWAF as well as the
conserved catalytic residues His/Ala and Asn/Ser. CP2 (YdhS) and CP3 (PMI) were
identified based upon their structural homology to CP1 (YJDE). The presence of GXCWAF
as well as residues His/Ala and Asn/Ser is noted in Figures 3 and 4. CP3 (PMI) was
previously characterized as a possible phosphomannose isomerase (Noramata).
There has been no previous characterization of CP3 as a cysteine protease.
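
The motif-based screening described above can be sketched as a simple pattern search over a translated sequence. This is not the FASTA search actually used; it only shows how occurrences of GXCWAF, with X taken to be any single residue, might be located. The function name and the example sequence are hypothetical and are not taken from the Figures.

```python
import re
from typing import List, Tuple

# G, any single amino acid (X), then C, W, A, F.
MOTIF = re.compile(r"G[A-Z]CWAF")


def find_gxcwaf(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each GXCWAF occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical translated sequence used only to exercise the function.
    print(find_gxcwaf("MKGSCWAFAAGTCWAF"))
```

A motif hit on its own is only a starting point; the text above additionally relies on alignment of the conserved catalytic residues against papain.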


II. Expression Systems 

The present invention provides host cells, expression methods and systems for
the enhanced production and secretion of desired heterologous or homologous
proteins in gram-positive microorganisms. In one embodiment, a host cell is
genetically engineered to have a deletion or mutation in the gene encoding a
gram-positive CP1, CP2 or CP3 such that the respective activity is deleted. In another
embodiment of the present invention, a gram-positive microorganism is genetically
engineered to produce a cysteine protease of the present invention.

Inactivation of a gram-positive cysteine protease in a host cell 

Producing an expression host cell incapable of producing the naturally
occurring cysteine protease necessitates the replacement and/or inactivation of the
naturally occurring gene from the genome of the host cell. In a preferred
embodiment, the mutation is a non-reverting mutation.

One method for mutating nucleic acid encoding a gram-positive cysteine
protease is to clone the nucleic acid or part thereof, modify the nucleic acid by
site-directed mutagenesis and reintroduce the mutated nucleic acid into the cell on a
plasmid. By homologous recombination, the mutated gene may be introduced into the
chromosome. In the parent host cell, the result is that the naturally occurring nucleic
acid and the mutated nucleic acid are located in tandem on the chromosome. After a
second recombination, the modified sequence is left in the chromosome,
thereby effectively introducing the mutation into the chromosomal gene for progeny of
the parent host cell.

Another method for inactivating the cysteine protease proteolytic activity is
through deleting the chromosomal gene copy. In a preferred embodiment, the entire
gene is deleted, the deletion occurring in such a way as to make reversion impossible.
In another preferred embodiment, a partial deletion is produced, provided that the
nucleic acid sequence left in the chromosome is too short for homologous
recombination with a plasmid encoded cysteine protease gene. In another preferred
embodiment, nucleic acid encoding the catalytic amino acid residues is deleted.

Deletion of the naturally occurring gram-positive microorganism cysteine
protease can be carried out as follows. A cysteine protease gene including its 5' and
3' regions is isolated and inserted into a cloning vector. The coding region of the
cysteine protease gene is deleted from the vector in vitro, leaving behind a sufficient
amount of the 5' and 3' flanking sequences to provide for homologous recombination
with the naturally occurring gene in the parent host cell. The vector is then
transformed into the gram-positive host cell. The vector integrates into the
chromosome via homologous recombination in the flanking regions. This method
leads to a gram-positive strain in which the protease gene has been deleted.
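
The in vitro deletion step described above can be modelled at the sequence level as joining the 5' and 3' flanking regions once the coding region has been removed. The sketch below is purely illustrative: the function name, the coordinate convention (0-based, end-exclusive) and the flank length are assumptions, and the text does not specify how much flanking sequence is sufficient for homologous recombination.

```python
def build_deletion_insert(locus: str, cds_start: int, cds_end: int,
                          flank_len: int) -> str:
    """Return the 5' flank joined directly to the 3' flank (coding region removed).

    `cds_start` and `cds_end` are 0-based, end-exclusive coordinates of the
    coding region within `locus`; `flank_len` is a user-chosen flank length.
    """
    if not (0 <= cds_start < cds_end <= len(locus)):
        raise ValueError("coding region coordinates are out of range")
    if cds_start < flank_len or len(locus) - cds_end < flank_len:
        raise ValueError("locus does not contain enough flanking sequence")
    five_prime = locus[cds_start - flank_len:cds_start]
    three_prime = locus[cds_end:cds_end + flank_len]
    return five_prime + three_prime


if __name__ == "__main__":
    # Hypothetical locus: 10 nt upstream, 6 nt coding region, 10 nt downstream.
    locus = "A" * 10 + "C" * 6 + "G" * 10
    print(build_deletion_insert(locus, 10, 16, 5))
```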

The vector used in an integration method is preferably a plasmid. A selectable
marker may be included to allow for ease of identification of desired recombinant
microorganisms. Additionally, as will be appreciated by one of skill in the art, the
vector is preferably one which can be selectively integrated into the chromosome.
This can be achieved by introducing an inducible origin of replication, for example, a
temperature sensitive origin into the plasmid. By growing the transformants at a
temperature to which the origin of replication is sensitive, the replication function of
the plasmid is inactivated, thereby providing a means for selection of chromosomal
integrants. Integrants may be selected for growth at high temperatures in the presence
of the selectable marker, such as an antibiotic. Integration mechanisms are described
in WO 88/06623.

Integration by the Campbell-type mechanism can take place in the 5' flanking
region of the protease gene, resulting in a protease positive strain carrying the entire
plasmid vector in the chromosome in the cysteine protease locus. Since illegitimate
recombination will give different results it will be necessary to determine whether the
complete gene has been deleted, such as through nucleic acid sequencing or restriction
maps.

Another method of inactivating the naturally occurring cysteine protease gene
is to mutagenize the chromosomal gene copy by transforming a gram-positive
microorganism with oligonucleotides which are mutagenic. Alternatively, the
chromosomal cysteine protease gene can be replaced with a mutant gene by
homologous recombination.

The present invention encompasses host cells having additional protease 
deletions or mutations, such as deletions or mutations in apr, npr, epr, mpr and others 
known to those of skill in the art.

One assay for the detection of mutants involves growing the Bacillus host cell 
on medium containing a protease substrate and measuring the appearance or lack 
thereof, of a zone of clearing or halo around the colonies. Host cells which have an 
inactive protease will exhibit little or no halo around the colonies. 


III. Production of Cysteine Protease 

For production of cysteine protease in a host cell, an expression vector
comprising at least one copy of nucleic acid encoding a gram-positive microorganism
CP1, CP2 or CP3, and preferably comprising multiple copies, is transformed into the
host cell under conditions suitable for expression of the cysteine protease. In
accordance with the present invention, polynucleotides which encode a gram-positive
microorganism CP1, CP2 or CP3, or fragments thereof, or fusion proteins or
polynucleotide homolog sequences that encode amino acid variants of B. subtilis CP1,
CP2 or CP3, may be used to generate recombinant DNA molecules that direct their
expression in host cells. In a preferred embodiment, the gram-positive host cell
belongs to the genus Bacillus. In another preferred embodiment, the gram-positive
host cell is B. subtilis.

As will be understood by those of skill in the art, it may be advantageous to
produce polynucleotide sequences possessing non-naturally occurring codons. Codons
preferred by a particular gram-positive host cell (Murray E et al (1989) Nuc Acids Res
17:477-508) can be selected, for example, to increase the rate of expression or to
produce recombinant RNA transcripts having desirable properties, such as a longer
half-life, than transcripts produced from naturally occurring sequence.

Altered CP1, CP2 or CP3 polynucleotide sequences which may be used in
accordance with the invention include deletions, insertions or substitutions of different
nucleotide residues resulting in a polynucleotide that encodes the same or a
functionally equivalent CP1, CP2 or CP3 homolog, respectively. As used herein a
"deletion" is defined as a change in either nucleotide or amino acid sequence in which
one or more nucleotides or amino acid residues, respectively, are absent.

As used herein an "insertion" or "addition" is that change in a nucleotide or
amino acid sequence which has resulted in the addition of one or more nucleotides or
amino acid residues, respectively, as compared to the naturally occurring CP1, CP2 or
CP3.

As used herein "substitution" results from the replacement of one or more 
nucleotides or amino acids by different nucleotides or amino acids, respectively. 

The encoded protein may also show deletions, insertions or substitutions of
amino acid residues which produce a silent change and result in a functionally
equivalent CP1, CP2 or CP3 variant. Deliberate amino acid substitutions may be made
on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues as long as the variant
retains the ability to modulate secretion. For example, negatively charged amino acids
include aspartic acid and glutamic acid; positively charged amino acids include lysine
and arginine; and amino acids with uncharged polar head groups having similar
hydrophilicity values include leucine, isoleucine, valine; glycine, alanine; asparagine,
glutamine; serine, threonine, phenylalanine, and tyrosine.
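
The substitution groups listed above can be written down as a small lookup for checking whether a proposed replacement stays within one group. This sketch follows the punctuation of the text literally, so serine, threonine, phenylalanine and tyrosine are treated as a single group; that reading, the one-letter codes and the function name are assumptions made for illustration.

```python
# Groups taken from the passage above, in one-letter amino acid codes.
SUBSTITUTION_GROUPS = [
    frozenset("DE"),    # aspartic acid, glutamic acid
    frozenset("KR"),    # lysine, arginine
    frozenset("LIV"),   # leucine, isoleucine, valine
    frozenset("GA"),    # glycine, alanine
    frozenset("NQ"),    # asparagine, glutamine
    frozenset("STFY"),  # serine, threonine, phenylalanine, tyrosine
]


def is_within_group(original: str, replacement: str) -> bool:
    """True if both residues fall in the same listed substitution group."""
    a, b = original.upper(), replacement.upper()
    return any(a in group and b in group for group in SUBSTITUTION_GROUPS)


if __name__ == "__main__":
    print(is_within_group("D", "E"), is_within_group("K", "E"))
```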

The CP1, CP2 or CP3 polynucleotides of the present invention may be
engineered in order to modify the cloning, processing and/or expression of the gene
product. For example, mutations may be introduced using techniques which are well
known in the art, e.g., site-directed mutagenesis, to insert new restriction sites, to alter
glycosylation patterns or to change codon preference.

In one embodiment of the present invention, a gram-positive microorganism
CP1, CP2 or CP3 polynucleotide may be ligated to a heterologous sequence to encode
a fusion protein. A fusion protein may also be engineered to contain a cleavage site
located between the cysteine protease nucleotide sequence and the heterologous
protein sequence, so that the cysteine protease may be cleaved and purified away from
the heterologous moiety.


IV. Vector Sequences 

Expression vectors used in expressing the cysteine proteases of the present
invention in gram-positive microorganisms comprise at least one promoter associated
with a cysteine protease selected from the group consisting of CP1, CP2 and CP3,
which promoter is functional in the host cell. In one embodiment of the present
invention, the promoter is the wild-type promoter for the selected cysteine protease
and in another embodiment of the present invention, the promoter is heterologous to
the cysteine protease, but still functional in the host cell. In one preferred embodiment
of the present invention, nucleic acid encoding the cysteine protease is stably
integrated into the microorganism genome.

In a preferred embodiment, the expression vector contains a multiple cloning
site cassette which preferably comprises at least one restriction endonuclease site
unique to the vector, to facilitate ease of nucleic acid manipulation. In a preferred
embodiment, the vector also comprises one or more selectable markers. As used
herein, the term selectable marker refers to a gene capable of expression in the
gram-positive host which allows for ease of selection of those hosts containing the vector.
Examples of such selectable markers include but are not limited to antibiotics, such
as erythromycin, actinomycin, chloramphenicol and tetracycline.

V. Transformation 

A variety of host cells can be used for the production of CP1,
CP2 and CP3 including bacterial, fungal, mammalian and insect cells. General
transformation procedures are taught in Current Protocols In Molecular Biology (vol.
1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987, Chapter 9) and include
calcium phosphate methods, transformation using DEAE-Dextran and electroporation.
Plant transformation methods are taught in Rodriquez (WO 95/14099, published 26
May 1995).

In a preferred embodiment, the host cell is a gram-positive microorganism and
in another preferred embodiment, the host cell is Bacillus. In one embodiment of the
present invention, nucleic acid encoding one or more cysteine protease(s) of the
present invention is introduced into a host cell via an expression vector capable of
replicating within the Bacillus host cell. Suitable replicating plasmids for Bacillus are
described in Molecular Biological Methods for Bacillus, Ed. Harwood and Cutting,
John Wiley & Sons, 1990, hereby expressly incorporated by reference; see chapter 3
on plasmids. Suitable replicating plasmids for B. subtilis are listed on page 92.

In another embodiment, nucleic acid encoding a cysteine protease(s) of the
present invention is stably integrated into the microorganism genome. Preferred host
cells are gram-positive host cells. Another preferred host is Bacillus. Another
preferred host is Bacillus subtilis. Several strategies have been described in the
literature for the direct cloning of DNA in Bacillus. Plasmid marker rescue
transformation involves the uptake of a donor plasmid by competent cells carrying a
partially homologous resident plasmid (Contente et al., Plasmid 2:555-571 (1979);
Haima et al., Mol. Gen. Genet. 223:185-191 (1990); Weinrauch et al., J. Bacteriol.
154(3):1077-1087 (1983); and Weinrauch et al., J. Bacteriol. 169(3):1205-1211
(1987)). The incoming donor plasmid recombines with the homologous region of the
resident "helper" plasmid in a process that mimics chromosomal transformation.

Transformation by protoplast transformation is described for B. subtilis in
Chang and Cohen, (1979) Mol. Gen. Genet. 168:111-115; for B. megaterium in
Vorobjeva et al., (1980) FEMS Microbiol. Letters 7:261-263; for B.
amyloliquefaciens in Smith et al., (1986) Appl. and Env. Microbiol. 51:634; for
B. thuringiensis in Fisher et al., (1981) Arch. Microbiol. 139:213-217; for B. sphaericus
in McDonald (1984) J. Gen. Microbiol. 130:203; and for B. larvae in Bakhiet et al., (1985)
49:577. Mann et al. (1986, Current Microbiol. 13:131-135) report on transformation
of Bacillus protoplasts and Holubova (1985, Folia Microbiol. 30:97) discloses
methods for introducing DNA into protoplasts using DNA-containing liposomes.


VI. Identification of Transformants 

Whether a host cell has been transformed with a mutated or a naturally
occurring gene encoding a gram-positive CP1, CP2 or CP3, detection of the
presence or absence of marker gene expression can suggest whether the gene of
interest is present. However, its expression should be confirmed. For example, if the
nucleic acid encoding a cysteine protease is inserted within a marker gene sequence,
recombinant cells containing the insert can be identified by the absence of marker
gene function. Alternatively, a marker gene can be placed in tandem with nucleic acid
encoding the cysteine protease under the control of a single promoter. Expression of
the marker gene in response to induction or selection usually indicates expression of
the cysteine protease as well.

Alternatively, host cells which contain the coding sequence for a cysteine
protease and express the protein may be identified by a variety of procedures known to
those of skill in the art. These procedures include, but are not limited to, DNA-DNA
or DNA-RNA hybridization and protein bioassay or immunoassay techniques which
include membrane-based, solution-based, or chip-based technologies for the detection
and/or quantification of the nucleic acid or protein.

The presence of the cysteine protease polynucleotide sequence can be detected by
DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or
fragments of B. subtilis CP1, CP2 or CP3.

VII. Assay of Protease Activity

There are various assays known to those of skill in the art for detecting and
measuring protease activity. There are assays based upon the release of acid-soluble
peptides from casein or hemoglobin measured as absorbance at 280 nm or
colorimetrically using the Folin method (Bergmeyer, et al., 1984, Methods of
Enzymatic Analysis vol. 5, Peptidases, Proteinases and their Inhibitors, Verlag
Chemie, Weinheim). Other assays involve the solubilization of chromogenic
substrates (Ward, 1983, Proteinases, in Microbial Enzymes and Biotechnology (W.M.
Fogarty, ed.), Applied Science, London, pp. 251-317).

VIII. Secretion of Recombinant Proteins

Means for determining the levels of secretion of a heterologous or homologous
protein in a gram-positive host cell and detecting secreted proteins include the use of
either polyclonal or monoclonal antibodies specific for the protein. Examples include
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and
fluorescence activated cell sorting (FACS). These and other assays are described,
among other places, in Hampton R et al (1990, Serological Methods, a Laboratory
Manual, APS Press, St Paul MN) and Maddox DE et al (1983, J Exp Med 158:1211).

A wide variety of labels and conjugation techniques are known by those
skilled in the art and can be used in various nucleic and amino acid assays. Means for
producing labeled hybridization or PCR probes for detecting specific polynucleotide
sequences include oligolabeling, nick translation, end-labeling or PCR amplification
using a labeled nucleotide. Alternatively, the nucleotide sequence, or any portion of
it, may be cloned into a vector for the production of an mRNA probe. Such vectors
are known in the art, are commercially available, and may be used to synthesize RNA
probes in vitro by addition of an appropriate RNA polymerase such as T7, T3 or SP6
and labeled nucleotides.

A number of companies such as Pharmacia Biotech (Piscataway NJ), Promega
(Madison WI), and US Biochemical Corp (Cleveland OH) supply commercial kits and
protocols for these procedures. Suitable reporter molecules or labels include those
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as
well as substrates, cofactors, inhibitors, magnetic particles and the like. Patents
teaching the use of such labels include US Patents 3,817,837; 3,850,752; 3,939,350;
3,996,345; 4,277,437; 4,275,149 and 4,366,241. Also, recombinant immunoglobulins
may be produced as shown in US Patent No. 4,816,567 and incorporated herein by
reference.


IX. Purification of Proteins

Gram-positive host cells transformed with polynucleotide sequences encoding
heterologous or homologous protein may be cultured under conditions suitable for the
expression and recovery of the encoded protein from cell culture. The protein
produced by a recombinant gram-positive host cell comprising a mutation or deletion
of the cysteine protease activity will be secreted into the culture media. Other
recombinant constructions may join the heterologous or homologous polynucleotide
sequences to a nucleotide sequence encoding a polypeptide domain which will facilitate
purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 12:441-53).

Such purification facilitating domains include, but are not limited to, metal
chelating peptides such as histidine-tryptophan modules that allow purification on
immobilized metals (Porath J (1992) Protein Expr Purif 3:263-281), protein A
domains that allow purification on immobilized immunoglobulin, and the domain
utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle
WA). The inclusion of a cleavable linker sequence such as Factor Xa or enterokinase
(Invitrogen, San Diego CA) between the purification domain and the heterologous
protein can be used to facilitate purification.

X. Uses of the Present Invention

CP1, CP2 and CP3 and Genetically Engineered Host Cells 
The present invention provides genetically engineered host cells comprising
preferably non-revertible mutations or deletions in the naturally occurring gene
encoding CP1, CP2 or CP3 such that the proteolytic activity is diminished or deleted
altogether. The host cell may contain additional protease deletions, such as deletions
of the mature subtilisin protease and/or mature neutral protease disclosed in United
States Patent No. 5,264,366.

In a preferred embodiment, the host cell is further genetically engineered to
produce a desired protein or polypeptide. In a preferred embodiment the host cell is a
Bacillus. In another preferred embodiment, the host cell is a Bacillus subtilis.

In an alternative embodiment, a host cell is genetically engineered to produce a
gram-positive CP1, CP2 or CP3. In a preferred embodiment, the host cell is grown
under large scale fermentation conditions, the CP1, CP2 or CP3 is isolated and/or
purified and used in cleaning compositions such as detergents. WO 95/10615
discloses detergent formulations.

CP1, CP2 and CP3 Polynucleotides

A B. subtilis polynucleotide, or any part thereof, provides the basis for detecting
the presence of gram-positive microorganism polynucleotide homologs through
hybridization techniques and PCR technology.

Accordingly, one aspect of the present invention is to provide for nucleic acid 
hybridization and PCR probes which can be used to detect polynucleotide sequences, 
including genomic and cDNA sequences, encoding gram-positive CP1, CP2 or CP3 or 
portions thereof.

The manner and method of carrying out the present invention may be more fully
understood by those of skill in the art by reference to the following examples, which
examples are not intended in any manner to limit the scope of the present invention
or of the claims directed thereto.

Example I 
Preparation of a Genomic Library

The following example illustrates the preparation of a Bacillus genomic library.

Genomic DNA from Bacillus cells is prepared as taught in Current Protocols
In Molecular Biology vol. 1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987,
chapter 2.4.1. Generally, Bacillus cells from a saturated liquid culture are lysed and
the proteins removed by digestion with proteinase K. Cell wall debris,
polysaccharides, and remaining proteins are removed by selective precipitation with
CTAB, and high molecular weight genomic DNA is recovered from the resulting
supernatant by isopropanol precipitation. If exceptionally clean genomic DNA is
desired, an additional step of purifying the Bacillus genomic DNA on a cesium
chloride gradient is added.

After obtaining purified genomic DNA, the DNA is subjected to Sau3A
digestion. Sau3A recognizes the 4 base pair site GATC and generates fragments
compatible with several convenient phage lambda and cosmid vectors. The DNA is
subjected to partial digestion to increase the chance of obtaining random fragments.
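
The effect of partial versus complete Sau3A digestion can be illustrated with a minimal Python sketch. It is not part of the disclosed method. It assumes that each GATC site is cut independently with a fixed probability and that the cut falls immediately 5' of the G, as is generally reported for Sau3A. The function names, the cut probability and the test sequence are hypothetical and chosen only for illustration.

```python
import random

SAU3A_SITE = "GATC"  # recognition sequence stated in the text


def find_sites(seq, site=SAU3A_SITE):
    """Return the 0-based start position of every occurrence of the site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, cut_positions):
    """Cut seq immediately 5' of each listed site start; return the fragments."""
    fragments = []
    previous = 0
    for pos in sorted(cut_positions):
        fragments.append(seq[previous:pos])
        previous = pos
    fragments.append(seq[previous:])
    return [f for f in fragments if f]  # drop empty fragments


def partial_digest(seq, cut_probability, rng):
    """Cut each site independently with the given probability (an assumption)."""
    chosen = [p for p in find_sites(seq) if rng.random() < cut_probability]
    return digest(seq, chosen)


if __name__ == "__main__":
    toy = "AAGATCTTTGATCGG"  # hypothetical sequence, not from the disclosure
    print(digest(toy, find_sites(toy)))               # complete digestion
    print(partial_digest(toy, 0.5, random.Random(1)))  # one possible partial digest
```

In a random sequence with equal base frequencies a 4 base pair site is expected about once every 4^4 = 256 base pairs, so complete digestion would yield mostly short fragments; cutting only a fraction of the sites leaves longer, overlapping fragments suitable for size fractionation.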

The partially digested Bacillus genomic DNA is subjected to size fractionation
on a 1% agarose gel prior to cloning into a vector. Alternatively, size fractionation on
a sucrose gradient can be used. The genomic DNA obtained from the size
fractionation step is purified away from the agarose and ligated into a cloning vector
appropriate for use in a host cell and transformed into the host cell.

Example II 

Detection of Gram-Positive Microorganisms

The following example describes the detection of gram-positive
microorganism CP1. The same procedures can be used to detect CP2 and CP3.

DNA derived from a gram-positive microorganism is prepared according to
the methods disclosed in Current Protocols in Molecular Biology, Chap. 2 or 3. The
nucleic acid is subjected to hybridization and/or PCR amplification with a probe or
primer derived from CP1. A preferred probe comprises the nucleic acid section
containing the conserved motif GXCWAF.
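
As an illustration only, the conserved motif GXCWAF can be located in a translated amino acid sequence with a short Python sketch. It assumes that X denotes any single residue written in one-letter code. The function name and the example sequences are hypothetical and are not taken from the CP1, CP2 or CP3 sequences of the disclosure.

```python
import re

# G, any one residue (X), then C, W, A, F in one-letter amino acid code.
MOTIF = re.compile(r"G[A-Z]CWAF")


def find_motif(protein):
    """Return (0-based position, matched residues) for each GXCWAF occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide fragments used only to exercise the function.
    print(find_motif("MKTGSCWAFSA"))  # motif present, X = S
    print(find_motif("MKTGCWAF"))     # no residue between G and C, so no match
```

The position returned for a match indicates which stretch of the corresponding coding sequence would be covered by a probe or primer spanning the motif.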

The nucleic acid probe is labeled by combining 50 pmol of the nucleic acid
and 250 mCi of [gamma-32P] adenosine triphosphate (Amersham, Chicago IL) and
T4 polynucleotide kinase (DuPont NEN®, Boston MA). The labeled probe is purified
with Sephadex G-25 super fine resin column (Pharmacia). A portion containing 1
counts per minute of each is used in a typical membrane based hybridization analysis
of nucleic acid sample of either genomic or cDNA origin.

The DNA sample which has been subjected to restriction endonuclease
digestion is fractionated on a 0.7 percent agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is
carried out for 16 hours at 40 degrees C. To remove nonspecific signals, blots are
sequentially washed at room temperature under increasingly stringent conditions up to
0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate. The blots are exposed
to film for several hours, the film developed and hybridization patterns are compared
visually to detect polynucleotide homologs of B. subtilis CP1. The homologs are
subjected to confirmatory nucleic acid sequencing. Methods for nucleic acid
sequencing are well known in the art. Conventional enzymatic methods employ DNA
polymerase Klenow fragment, SEQUENASE® (US Biochemical Corp, Cleveland,
OH) or Taq polymerase to extend DNA chains from an oligonucleotide primer
annealed to the DNA template of interest.

Various other examples and modifications of the foregoing description and
examples will be apparent to a person skilled in the art after reading the disclosure
without departing from the spirit and scope of the invention, and it is intended that all
such examples or modifications be included within the scope of the appended claims.
All publications and patents referenced herein are hereby incorporated by reference in
their entirety.